SUMMARY


Exact  tests  are  derived  for  testing  hypotheses  concerning  the 
variance  components  of  the  main  effects  in  an  unbalanced  random 
two-way  crossed  classification  with  interaction  model. The  tests 
are  based  on  four  sums  of  squares  which  are  distributed 
independently  as  scalar  multiples  of  chi-squared  variates.  These 
sums  of  squares  can  also  be  used  to  find  an  exact  test  concerning 
the  interaction  variance  component,  and  to  obtain  simultaneous 
confidence  intervals  on  all  continuous  functions  of  the  model's 
variance  components.  A  study  is  made  concerning  the  power  of  the 
proposed  tests,  including  a  comparison  with  other  approximate 
tests. 


Key words: Variance components; Unbalanced random model; Two-way
crossed classification with interaction model; Hypothesis testing;
Power of exact tests.


1. Introduction

There  are  very  few  exact  tests  available  concerning  the  variance 
components  in  an  unbalanced  random  or  mixed  model.  This  is  mainly 
attributed  to  the  fact  that  in  an  unbalanced  data  situation,  the 
traditional  partitioning  of  the  total  sum  of  squares  does  not  in 
general  yield  independent  and  chi-squared  type  sums  of  squares. 
Furthermore,  such  partitioning  is  not  unique  as  is  the  case  with 
balanced  models. 

Wald (1940, 1941) was the first author to introduce exact
testing procedures for the unbalanced one-way and two-way crossed
classification without interaction models. A generalization of
Wald's tests was recently discussed by Seely and El-Bassiouni
(1983). See also Harville and Fenech (1985). Using two different
approaches, Spjøtvoll (1968) and Thomsen (1975) developed exact
tests concerning the variance components in an unbalanced random
two-way crossed classification with interaction model. However,
unless the interaction variance component is zero, neither
Spjøtvoll's tests nor those of Thomsen can be used to make
inferences about the main effects' variance components. In this
paper we overcome this difficulty by producing exact tests
concerning the latter variance components which apply even when
the interaction variance component is different from zero. A
comparison of the power of the exact tests with simulated powers
of approximate ANOVA-based F tests shows that in most cases the
exact tests are at least as powerful as the approximate tests.

One main disadvantage of the latter tests is that their true
critical values are unknown since they depend on variance
components other than those under consideration. Thus, in
practice the critical values of the approximate tests must be
estimated using some procedure such as Satterthwaite's
approximation. The simulation study shows that in some cases this
approximation is highly unreliable for producing the critical
values. Only in such cases were the approximate tests observed to
be more powerful than the exact tests.


2.  The  Development  of  the  Exact  Tests 


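We shall adopt the same notation as in Thomsen (1975). Consider
the unbalanced random two-way crossed classification model

$$ y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}, \qquad (2.1) $$

$i = 1,2,\ldots,r$; $j = 1,2,\ldots,s$; $k = 1,2,\ldots,n_{ij}$, where $\mu$ is an
unknown constant parameter; $\alpha_i$, $\beta_j$, $(\alpha\beta)_{ij}$, and $e_{ijk}$ are
independent normally distributed random variables with zero means
and variances $\sigma_\alpha^2$, $\sigma_\beta^2$, $\sigma_{\alpha\beta}^2$, and $\sigma_e^2$, respectively.
Alternatively, (2.1) can be written in matrix form as

$$ \mathbf{y} = \mu\mathbf{1}_{n_{..}} + \mathbf{X}_1\boldsymbol{\alpha} + \mathbf{X}_2\boldsymbol{\beta} + \mathbf{X}_3(\boldsymbol{\alpha\beta}) + \mathbf{e}, \qquad (2.2) $$

where $\mathbf{y}$ is the vector of observations of dimension
$n_{..} = \sum_{i,j} n_{ij}$, $\mathbf{1}_{n_{..}}$ is a vector of ones of dimension $n_{..}$,
and $\mathbf{X}_1$, $\mathbf{X}_2$, and $\mathbf{X}_3$ are matrices of zeros and ones of orders
$n_{..} \times r$, $n_{..} \times s$, and $n_{..} \times rs$, respectively. The
variance-covariance matrix of $\mathbf{y}$, denoted by $\boldsymbol{\Sigma}$, is

$$ \boldsymbol{\Sigma} = \mathbf{X}_1\mathbf{X}_1'\sigma_\alpha^2 + \mathbf{X}_2\mathbf{X}_2'\sigma_\beta^2 + \mathbf{X}_3\mathbf{X}_3'\sigma_{\alpha\beta}^2 + \sigma_e^2\mathbf{I}_{n_{..}}, \qquad (2.3) $$

where $\mathbf{I}_{n_{..}}$ is the identity matrix of order $n_{..} \times n_{..}$.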

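Let $\bar{y}_{ij.}$ be the $(i,j)$th sample cell mean ($i = 1,2,\ldots,r$;
$j = 1,2,\ldots,s$). From (2.1) we have

$$ \bar{y}_{ij.} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \bar{e}_{ij.}, \qquad (2.4) $$

$i = 1,2,\ldots,r$; $j = 1,2,\ldots,s$, where
$\bar{e}_{ij.} = \sum_{k=1}^{n_{ij}} e_{ijk}/n_{ij}$. In matrix form, (2.4) may be written as

$$ \bar{\mathbf{y}} = \mu\mathbf{1}_{rs} + \mathbf{B}_1\boldsymbol{\alpha} + \mathbf{B}_2\boldsymbol{\beta} + \mathbf{I}_{rs}(\boldsymbol{\alpha\beta}) + \bar{\mathbf{e}}, \qquad (2.5) $$

where $\mathbf{B}_1 = \mathbf{I}_r \otimes \mathbf{1}_s$, $\mathbf{B}_2 = \mathbf{1}_r \otimes \mathbf{I}_s$, and $\otimes$ is the direct
product symbol. The variance-covariance matrix of $\bar{\mathbf{y}}$ is

$$ \mathrm{Var}(\bar{\mathbf{y}}) = \mathbf{A}_1\sigma_\alpha^2 + \mathbf{A}_2\sigma_\beta^2 + \mathbf{I}_{rs}\sigma_{\alpha\beta}^2 + \mathbf{K}\sigma_e^2, $$

where $\mathbf{A}_1 = \mathbf{B}_1\mathbf{B}_1'$, $\mathbf{A}_2 = \mathbf{B}_2\mathbf{B}_2'$, and
$\mathbf{K} = \mathrm{diag}(n_{11}^{-1}, n_{12}^{-1}, \ldots, n_{rs}^{-1})$.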

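Let $\mathbf{z} = \mathbf{P}\bar{\mathbf{y}}$, where $\mathbf{P}$ is an orthogonal matrix of order $rs \times rs$
whose first row is $(rs)^{-1/2}\mathbf{1}_{rs}'$ and which simultaneously
diagonalizes $\mathbf{A}_1$ and $\mathbf{A}_2$. The vector $\mathbf{z}$ can be partitioned as
$(z_1, \mathbf{z}_\alpha', \mathbf{z}_\beta', \mathbf{z}_{\alpha\beta}')'$, where $z_1$ is the first element of $\mathbf{z}$;
$\mathbf{z}_\alpha$, $\mathbf{z}_\beta$, and $\mathbf{z}_{\alpha\beta}$ are vectors of dimensions $r-1$, $s-1$, and
$(r-1)(s-1)$, respectively. The latter three vectors are normally
distributed with zero means and have the following
variance-covariance matrices (see Thomsen 1975, p. 259):

$$ \mathrm{Var}(\mathbf{z}_\alpha) = (s\sigma_\alpha^2 + \sigma_{\alpha\beta}^2)\mathbf{I}_{r-1} + \mathbf{K}_1\sigma_e^2 $$

$$ \mathrm{Var}(\mathbf{z}_\beta) = (r\sigma_\beta^2 + \sigma_{\alpha\beta}^2)\mathbf{I}_{s-1} + \mathbf{K}_2\sigma_e^2 $$

$$ \mathrm{Var}(\mathbf{z}_{\alpha\beta}) = \sigma_{\alpha\beta}^2\mathbf{I}_{(r-1)(s-1)} + \mathbf{K}_3\sigma_e^2, $$

where $\mathbf{K}_1$, $\mathbf{K}_2$, and $\mathbf{K}_3$ are the submatrices of $\mathbf{P}\mathbf{K}\mathbf{P}'$ which
correspond to $\mathbf{z}_\alpha$, $\mathbf{z}_\beta$, and $\mathbf{z}_{\alpha\beta}$, respectively. Let $\mathbf{u}$ be the
vector

$$ \mathbf{u} = (\mathbf{z}_\alpha', \mathbf{z}_\beta', \mathbf{z}_{\alpha\beta}')'. \qquad (2.6) $$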


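Then,

$$ \mathrm{Var}(\mathbf{u}) = \mathrm{diag}(\delta_1\mathbf{I}_{r-1}, \delta_2\mathbf{I}_{s-1}, \delta_3\mathbf{I}_{(r-1)(s-1)}) + \mathbf{L}\sigma_e^2, \qquad (2.7) $$

where $\delta_1$, $\delta_2$, and $\delta_3$ are given by

$$ \delta_1 = s\sigma_\alpha^2 + \sigma_{\alpha\beta}^2, \quad \delta_2 = r\sigma_\beta^2 + \sigma_{\alpha\beta}^2, \quad \delta_3 = \sigma_{\alpha\beta}^2, \qquad (2.8) $$

and $\mathbf{L}$ is the $(rs-1) \times (rs-1)$ submatrix of $\mathbf{P}\mathbf{K}\mathbf{P}'$ which corresponds
to $\mathbf{u}$ and is expressible in the form

$$ \mathbf{L} = \begin{bmatrix} \mathbf{K}_1 & \mathbf{K}_{12} & \mathbf{K}_{13} \\ \mathbf{K}_{12}' & \mathbf{K}_2 & \mathbf{K}_{23} \\ \mathbf{K}_{13}' & \mathbf{K}_{23}' & \mathbf{K}_3 \end{bmatrix}, \qquad (2.9) $$

where $\mathbf{K}_{12}\sigma_e^2 = E(\mathbf{z}_\alpha\mathbf{z}_\beta')$, $\mathbf{K}_{13}\sigma_e^2 = E(\mathbf{z}_\alpha\mathbf{z}_{\alpha\beta}')$, and
$\mathbf{K}_{23}\sigma_e^2 = E(\mathbf{z}_\beta\mathbf{z}_{\alpha\beta}')$. The matrix $\mathbf{L}$ is of rank $rs-1$ and is,
therefore, nonsingular.

The random vectors $\mathbf{z}_\alpha$, $\mathbf{z}_\beta$, and $\mathbf{z}_{\alpha\beta}$ are not independent.
However, they are independent of the error sum of squares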


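$$ Q = \sum_{i,j,k} (y_{ijk} - \bar{y}_{ij.})^2, \qquad (2.10) $$

which can also be written as

$$ Q = \mathbf{y}'\mathbf{R}\mathbf{y}, \qquad (2.11) $$

where $\mathbf{y}$ is the vector of observations and $\mathbf{R}$ is the $n_{..} \times n_{..}$ matrix

$$ \mathbf{R} = \mathbf{I}_{n_{..}} - \bigoplus_{i,j} (\mathbf{J}_{n_{ij}}/n_{ij}). \qquad (2.12) $$

In (2.12) $\mathbf{J}_{n_{ij}}$ is the matrix of ones of order $n_{ij} \times n_{ij}$
($i = 1,2,\ldots,r$; $j = 1,2,\ldots,s$) and the second term is the direct sum
of the $\mathbf{J}_{n_{ij}}/n_{ij}$'s. It is easy to verify that $\mathbf{R}$ is idempotent of
rank $n_{..} - rs$ and that

$$ \mathbf{R}\mathbf{X}_1 = \mathbf{R}\mathbf{X}_2 = \mathbf{R}\mathbf{X}_3 = \mathbf{0}, \qquad (2.13) $$

where $\mathbf{X}_1$, $\mathbf{X}_2$, and $\mathbf{X}_3$ are the matrices of zeros and ones in
(2.2). Hence, $Q/\sigma_e^2$ has a chi-squared distribution with $n_{..} - rs$
degrees of freedom (see also Thomsen 1975, p. 260).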
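
As a concrete illustration of (2.4) and (2.10), the following minimal sketch computes the cell means, the error sum of squares Q, and its degrees of freedom from raw data. The function name, the nested-list data layout, and the numbers are illustrative assumptions, not part of the original development.

```python
def cell_means_and_q(cells):
    """cells[i][j] is the list of observations y_ijk in cell (i, j).

    Returns the matrix of cell means, the error sum of squares Q of (2.10),
    and its degrees of freedom n.. - rs.
    """
    ybar = [[sum(obs) / len(obs) for obs in row] for row in cells]
    q = sum((y - ybar[i][j]) ** 2
            for i, row in enumerate(cells)
            for j, obs in enumerate(row)
            for y in obs)
    n_total = sum(len(obs) for row in cells for obs in row)
    df = n_total - len(cells) * len(cells[0])
    return ybar, q, df


# Made-up 2 x 2 layout with unequal cell frequencies (2, 3, 1, 2).
cells = [[[1.0, 3.0], [2.0, 2.0, 5.0]],
         [[4.0], [0.0, 2.0]]]
ybar, q, df = cell_means_and_q(cells)
```

A cell with a single observation contributes nothing to Q, which is why assumption-type conditions on the cell frequencies matter for the error degrees of freedom.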

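Since $\mathbf{R}$ is symmetric it can be written as

$$ \mathbf{R} = \mathbf{C}\boldsymbol{\Lambda}\mathbf{C}', \qquad (2.14) $$

where $\mathbf{C}$ is an orthogonal matrix and $\boldsymbol{\Lambda}$ is a diagonal matrix of
eigenvalues of $\mathbf{R}$, both of order $n_{..} \times n_{..}$. We shall assume that

$$ n_{..} > 2rs - 1. \qquad (2.15) $$

This is not an unreasonable assumption and can, for example, be
satisfied if each cell contains at least two observations. In
this case and because $\mathbf{R}$ is idempotent of rank $n_{..} - rs > rs - 1$, $\boldsymbol{\Lambda}$
and $\mathbf{C}$ can be partitioned as

$$ \boldsymbol{\Lambda} = \mathrm{diag}(\mathbf{I}_{\nu_1}, \mathbf{I}_{\nu_2}, \mathbf{0}), \qquad \mathbf{C} = [\mathbf{C}_1 : \mathbf{C}_2 : \mathbf{C}_3], \qquad (2.16) $$

where

$$ \nu_1 = rs - 1, \qquad \nu_2 = n_{..} - 2rs + 1, \qquad (2.17) $$

$\mathbf{0}$ is a zero matrix of order $rs \times rs$, and $\mathbf{C}_1$, $\mathbf{C}_2$, $\mathbf{C}_3$ are of orders
$n_{..} \times \nu_1$, $n_{..} \times \nu_2$, and $n_{..} \times rs$, respectively. Note that

$$ \mathbf{C}_i'\mathbf{C}_i = \mathbf{I}, \quad i = 1,2,3; \qquad \mathbf{C}_i'\mathbf{C}_j = \mathbf{0}, \quad i \neq j. \qquad (2.18) $$

Formula (2.14) can then be rewritten as

$$ \mathbf{R} = \mathbf{C}_1\mathbf{C}_1' + \mathbf{C}_2\mathbf{C}_2'. \qquad (2.19) $$

From (2.11) and (2.19), the error sum of squares $Q$ can be
partitioned as

$$ Q = Q_1 + Q_2, \qquad (2.20) $$

where

$$ Q_1 = \mathbf{y}'\mathbf{C}_1\mathbf{C}_1'\mathbf{y}, \qquad (2.21) $$

$$ Q_2 = \mathbf{y}'\mathbf{C}_2\mathbf{C}_2'\mathbf{y}. \qquad (2.22) $$

The sums of squares $Q_1$ and $Q_2$, and the random vector $\mathbf{u}$ in (2.6)
are independent. Furthermore, $Q_1/\sigma_e^2$ and $Q_2/\sigma_e^2$ have the
chi-squared distribution with $\nu_1$ and $\nu_2$ degrees of freedom,
respectively.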


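We now define the random vector $\boldsymbol{\omega}$ as

$$ \boldsymbol{\omega} = \mathbf{u} + (\lambda_{\max}\mathbf{I}_{\nu_1} - \mathbf{L})^{1/2}\mathbf{C}_1'\mathbf{y}, \qquad (2.23) $$

where $\mathbf{L}$ is the matrix in (2.9) and $\lambda_{\max}$ is its largest eigenvalue.
The matrix $\lambda_{\max}\mathbf{I}_{\nu_1} - \mathbf{L}$ is positive semidefinite, hence the matrix
$(\lambda_{\max}\mathbf{I}_{\nu_1} - \mathbf{L})^{1/2}$ is well defined with eigenvalues equal to the
square roots of the eigenvalues of $\lambda_{\max}\mathbf{I}_{\nu_1} - \mathbf{L}$. Let $\boldsymbol{\omega}$ be
partitioned just like $\mathbf{u}$ in (2.6) as

$$ \boldsymbol{\omega} = (\boldsymbol{\omega}_\alpha', \boldsymbol{\omega}_\beta', \boldsymbol{\omega}_{\alpha\beta}')', \qquad (2.24) $$

where the vectors $\boldsymbol{\omega}_\alpha$, $\boldsymbol{\omega}_\beta$, and $\boldsymbol{\omega}_{\alpha\beta}$ are of dimensions $r-1$, $s-1$, and
$(r-1)(s-1)$, respectively.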


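Lemma 1

(i) $E(\boldsymbol{\omega}_\alpha) = \mathbf{0}$, $E(\boldsymbol{\omega}_\beta) = \mathbf{0}$, $E(\boldsymbol{\omega}_{\alpha\beta}) = \mathbf{0}$.

(ii) $\boldsymbol{\omega}_\alpha$, $\boldsymbol{\omega}_\beta$, and $\boldsymbol{\omega}_{\alpha\beta}$ are independent normally distributed
random vectors and have the following variance-covariance
matrices:

$$ \mathrm{Var}(\boldsymbol{\omega}_\alpha) = (s\sigma_\alpha^2 + \sigma_{\alpha\beta}^2 + \lambda_{\max}\sigma_e^2)\mathbf{I}_{r-1}, $$

$$ \mathrm{Var}(\boldsymbol{\omega}_\beta) = (r\sigma_\beta^2 + \sigma_{\alpha\beta}^2 + \lambda_{\max}\sigma_e^2)\mathbf{I}_{s-1}, \qquad (2.25) $$

$$ \mathrm{Var}(\boldsymbol{\omega}_{\alpha\beta}) = (\sigma_{\alpha\beta}^2 + \lambda_{\max}\sigma_e^2)\mathbf{I}_{(r-1)(s-1)}. $$

(iii) $\boldsymbol{\omega}_\alpha$, $\boldsymbol{\omega}_\beta$, and $\boldsymbol{\omega}_{\alpha\beta}$ are independent of $Q_2$, where $Q_2$ is the
sum of squares in (2.22).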

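Proof. (i) From (2.12) it can be seen that $\mathbf{R}\mathbf{1}_{n_{..}} = \mathbf{0}$. Thus,
by (2.19) we can write

$$ (\mathbf{C}_1\mathbf{C}_1' + \mathbf{C}_2\mathbf{C}_2')\mathbf{1}_{n_{..}} = \mathbf{0}. \qquad (2.26) $$

Using (2.18) in (2.26) we get $\mathbf{C}_1'\mathbf{1}_{n_{..}} = \mathbf{0}$. It follows that
$E(\mathbf{C}_1'\mathbf{y}) = \mathbf{C}_1'\mathbf{1}_{n_{..}}\mu = \mathbf{0}$. Since $\mathbf{u}$ in (2.6) has also a zero mean, we
conclude that the mean of $\boldsymbol{\omega}$ in (2.23) is zero.

(ii) It is obvious that $\boldsymbol{\omega}$ in (2.23) is normally distributed. We
now claim that $\mathbf{u}$ is independent of $\mathbf{C}_1'\mathbf{y}$. To show this we note that
the vector $\bar{\mathbf{y}}$ in (2.5), which can be written as $\bar{\mathbf{y}} = \mathbf{D}\mathbf{y}$, where
$\mathbf{D} = \bigoplus_{i,j} (\mathbf{1}_{n_{ij}}'/n_{ij})$, is independent of $Q$ and hence of $Q_1$ (see 2.20 and
2.21). Consequently,

$$ \mathbf{D}\boldsymbol{\Sigma}\mathbf{C}_1\mathbf{C}_1' = \mathbf{0}, \qquad (2.27) $$

where $\boldsymbol{\Sigma}$ is the variance-covariance matrix of $\mathbf{y}$ given in (2.3) (see
Searle 1971, p. 59). From (2.18) and (2.27) we conclude that

$$ \mathbf{D}\boldsymbol{\Sigma}\mathbf{C}_1 = \mathbf{0}. \qquad (2.28) $$

Hence, $\mathrm{Cov}(\bar{\mathbf{y}}, \mathbf{y}'\mathbf{C}_1) = \mathbf{D}\boldsymbol{\Sigma}\mathbf{C}_1 = \mathbf{0}$. Since $\mathbf{u}$ is a subvector of
$\mathbf{z} = \mathbf{P}\bar{\mathbf{y}}$, then $\mathbf{u}$ is independent of $\mathbf{C}_1'\mathbf{y}$ as was claimed.

The variance-covariance matrix of $\boldsymbol{\omega}$ in (2.23) can then be
written as

$$ \mathrm{Var}(\boldsymbol{\omega}) = \mathrm{Var}(\mathbf{u}) + (\lambda_{\max}\mathbf{I}_{\nu_1} - \mathbf{L})^{1/2}\,\mathbf{C}_1'\boldsymbol{\Sigma}\mathbf{C}_1\,(\lambda_{\max}\mathbf{I}_{\nu_1} - \mathbf{L})^{1/2}. \qquad (2.29) $$

But from (2.13) and (2.19) we have

$$ (\mathbf{C}_1\mathbf{C}_1' + \mathbf{C}_2\mathbf{C}_2')\mathbf{X}_i = \mathbf{0}, \quad i = 1,2,3. \qquad (2.30) $$

Using (2.18), equalities (2.30) yield

$$ \mathbf{C}_1'\mathbf{X}_i = \mathbf{0}, \quad i = 1,2,3. \qquad (2.31) $$

It follows that

$$ \mathbf{C}_1'\boldsymbol{\Sigma}\mathbf{C}_1 = \mathbf{C}_1'\mathbf{C}_1\sigma_e^2 = \mathbf{I}_{\nu_1}\sigma_e^2. \qquad (2.32) $$

From (2.7), (2.29), and (2.32) we then obtain

$$ \mathrm{Var}(\boldsymbol{\omega}) = \mathrm{diag}(\delta_1\mathbf{I}_{r-1}, \delta_2\mathbf{I}_{s-1}, \delta_3\mathbf{I}_{(r-1)(s-1)}) + \mathbf{L}\sigma_e^2 + (\lambda_{\max}\mathbf{I}_{\nu_1} - \mathbf{L})\sigma_e^2, $$

that is,

$$ \mathrm{Var}(\boldsymbol{\omega}) = \mathrm{diag}\bigl((\delta_1 + \lambda_{\max}\sigma_e^2)\mathbf{I}_{r-1},\ (\delta_2 + \lambda_{\max}\sigma_e^2)\mathbf{I}_{s-1},\ (\delta_3 + \lambda_{\max}\sigma_e^2)\mathbf{I}_{(r-1)(s-1)}\bigr). \qquad (2.33) $$

From (2.33) we conclude that $\boldsymbol{\omega}_\alpha$, $\boldsymbol{\omega}_\beta$, and $\boldsymbol{\omega}_{\alpha\beta}$ are independent and
have the variance-covariance matrices described in (2.25).

(iii) $Q_2$ is independent of $\mathbf{u}$ (since $Q$ is) and is also independent
of $\mathbf{C}_1'\mathbf{y}$ since $\mathbf{C}_1'\boldsymbol{\Sigma}\mathbf{C}_2 = \mathbf{0}$, which follows from (2.18) and (2.31)
after noting the formula for $\boldsymbol{\Sigma}$ in (2.3). Thus, $Q_2$ is independent
of $\boldsymbol{\omega}$.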

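From Lemma 1 we conclude that the sums of squares,
$S_\alpha = \boldsymbol{\omega}_\alpha'\boldsymbol{\omega}_\alpha$, $S_\beta = \boldsymbol{\omega}_\beta'\boldsymbol{\omega}_\beta$, $S_{\alpha\beta} = \boldsymbol{\omega}_{\alpha\beta}'\boldsymbol{\omega}_{\alpha\beta}$, and $Q_2$ are distributed
independently, and

$$ S_\alpha/(s\sigma_\alpha^2 + \sigma_{\alpha\beta}^2 + \lambda_{\max}\sigma_e^2) \sim \chi^2_{r-1} $$

$$ S_\beta/(r\sigma_\beta^2 + \sigma_{\alpha\beta}^2 + \lambda_{\max}\sigma_e^2) \sim \chi^2_{s-1} $$

$$ S_{\alpha\beta}/(\sigma_{\alpha\beta}^2 + \lambda_{\max}\sigma_e^2) \sim \chi^2_{(r-1)(s-1)} $$

$$ Q_2/\sigma_e^2 \sim \chi^2_{\nu_2}, $$

where $\nu_2$ is given in (2.17). A test statistic for testing the
hypothesis $H_0: \sigma_\alpha^2 = 0$ vs. $H_a: \sigma_\alpha^2 \neq 0$ is, therefore, $F = MS_\alpha/MS_{\alpha\beta}$,
where $MS_\alpha = S_\alpha/(r-1)$ and $MS_{\alpha\beta} = S_{\alpha\beta}/[(r-1)(s-1)]$. Under $H_0$ this
statistic has the F distribution with $r-1$ and $(r-1)(s-1)$ degrees
of freedom. The hypothesis $H_0$ can be rejected at the $\alpha$-level of
significance if $F \geq F_{\alpha,\,r-1,\,(r-1)(s-1)}$, the upper $\alpha 100\%$ point of
the corresponding F distribution. Similarly, to test the
hypothesis $H_0: \sigma_\beta^2 = 0$ vs. $H_a: \sigma_\beta^2 \neq 0$, we use the statistic
$F = MS_\beta/MS_{\alpha\beta}$, where $MS_\beta = S_\beta/(s-1)$. Furthermore, the statistic
$F = (\nu_2/\lambda_{\max})(MS_{\alpha\beta}/Q_2)$ can be used to test the hypothesis $H_0: \sigma_{\alpha\beta}^2 = 0$
vs. $H_a: \sigma_{\alpha\beta}^2 \neq 0$. We do not, however, recommend using this test
since it has fewer denominator degrees of freedom than the exact
test for $\sigma_{\alpha\beta}^2$ given by Thomsen (1975).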

We note that if the data set is balanced, then
$\mathbf{K} = \mathrm{diag}(n_{11}^{-1}, \ldots, n_{rs}^{-1}) = \mathbf{I}_{rs}/n$, where $n$ is the number of observations
per cell. Hence, $\mathbf{P}\mathbf{K}\mathbf{P}' = \mathbf{I}_{rs}/n$ and $\mathbf{L} = \mathbf{I}_{\nu_1}/n$, where $\nu_1$ is given in
(2.17). Consequently, $\lambda_{\max} = 1/n$ and the vectors $\boldsymbol{\omega}$ and $\mathbf{u}$ in
(2.23) become identical. Furthermore, the sums of squares $nS_\alpha$,
$nS_\beta$, and $nS_{\alpha\beta}$ reduce to the balanced ANOVA sums of squares
associated with the main and interaction effects, respectively.
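
The quantity $\lambda_{\max}$ can be computed without constructing $\mathbf{P}$. The sketch below is one possible route and is not taken from the text: it assumes the identity $\mathbf{P}_1'\mathbf{P}_1 = \mathbf{I}_{rs} - \mathbf{J}_{rs}/rs$ stated as (A.2) in the Appendix, from which the nonzero eigenvalues of $\mathbf{L} = \mathbf{P}_1\mathbf{K}\mathbf{P}_1'$ coincide with those of $\mathbf{K}^{1/2}(\mathbf{I}_{rs} - \mathbf{J}_{rs}/rs)\mathbf{K}^{1/2}$, and it applies plain power iteration to the latter. The function name, the start vector, and the iteration count are illustrative choices.

```python
def lambda_max(n, iters=2000):
    """Approximate the largest eigenvalue of L = P1 K P1' by power iteration.

    n is the r x s table of cell frequencies n_ij (a list of rows).
    Works on M = K^{1/2} (I - J/rs) K^{1/2}, which shares its nonzero
    eigenvalues with L.
    """
    k = [1.0 / nij for row in n for nij in row]      # diagonal of K
    rs = len(k)
    root = [v ** 0.5 for v in k]                      # diagonal of K^{1/2}

    def apply_m(v):
        w = [a * b for a, b in zip(root, v)]          # K^{1/2} v
        mean = sum(w) / rs
        return [a * (b - mean) for a, b in zip(root, w)]  # K^{1/2}(I - J/rs) w

    v = [(-1.0) ** i * (i + 1) for i in range(rs)]    # arbitrary start vector
    for _ in range(iters):
        w = apply_m(v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    mv = apply_m(v)
    return sum(a * b for a, b in zip(v, mv))          # Rayleigh quotient


balanced = lambda_max([[4, 4, 4], [4, 4, 4]])         # balanced case: 1/n
unbalanced = lambda_max([[1, 2], [3, 6]])             # made-up frequencies
```

For the balanced table the result agrees with $\lambda_{\max} = 1/n$. Power iteration only approaches the largest eigenvalue from below and can stall if the start vector happens to be orthogonal to the leading eigenvector, so a library eigenvalue routine would be the safer choice in practice.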

The  following  lemma  is  useful  for  the  power  study  in  Section 
5  and  is  proved  in  the  Appendix: 


Lemma 2 


cable: 


288 

Family  number 

352  19  141 

60 

.804 

.734 

.967 

.917 

.850 

34 

.967 

.817 

.930 

.970 

.833 

.889 

.304 

.867 

.407 

.896 

.952 

.486 

35 

.667 

.511 

.717 

.467 

.793 

.274 

.458 

.428 

Test  number 

.409 

.411 

.919 

.408  .275 

40 

.569 

.646 

.669 

.435  .256 

.715 

.310 

.669 

.500 

.487 

.450 

.587 

.304 

.928 

.367  .525 

41 

.538 

.428 

.855 

.961 

.655 

.300 

.800 

We analyze variation due to family and test according to the model
in (2.1), where $y_{ijk}$ = arcsin-square root of the $k$th observed
proportion in family $i$ and test $j$. The exact test will be used to
test, for example, the null hypothesis $H_0: \sigma_\alpha^2 = 0$ regarding the
family variance component.


The first step is to obtain the matrix $\mathbf{P}$. This can be done
using the algorithm given by Graybill (1983, p. 406).
Alternatively, $\mathbf{P}$ may be constructed with rows 2 through 20 given
by sets of normalized orthogonal contrasts for the $\alpha$ effect, $\beta$ effect
and $\alpha\beta$ effect for balanced data. Computation of the other components
of $\boldsymbol{\omega}$ in (2.23) is straightforward. For the present example, we
obtain $MS_\alpha/MS_{\alpha\beta} = 0.1543/0.0415 = 3.718$, which has an observed
significance level of 0.034.
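
The observed significance level can be checked with a few lines of code. The sketch below assumes $r - 1 = 4$ and $(r-1)(s-1) = 12$ degrees of freedom, which is what $rs = 20$ and five family labels suggest and which reproduces the reported value; the degrees of freedom are not stated explicitly in this section. It uses the closed-form binomial sum for the upper tail of an F distribution with even degrees of freedom. The function name is illustrative.

```python
from math import comb


def f_upper_tail(f, d1, d2):
    """P[F(d1, d2) >= f] for even d1 and d2.

    Uses P[F >= f] = I_x(d2/2, d1/2) with x = d2 / (d2 + d1 * f); for integer
    arguments the regularized incomplete beta function is a binomial tail sum.
    """
    if d1 % 2 or d2 % 2:
        raise ValueError("this closed form needs even degrees of freedom")
    if f <= 0:
        return 1.0
    k, m = d1 // 2, d2 // 2
    x = d2 / (d2 + d1 * f)
    n = m + k - 1
    return sum(comb(n, j) * x ** j * (1 - x) ** (n - j) for j in range(m, n + 1))


f_obs = 0.1543 / 0.0415
p_value = f_upper_tail(f_obs, 4, 12)
print(round(f_obs, 3), round(p_value, 3))   # prints 3.718 0.034
```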

4.  Simultaneous  Confidence  Intervals  on  the  Variance  Components 


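One of the interesting consequences of Lemma 1 is that simultaneous
confidence intervals on all continuous functions of $\sigma_\alpha^2$, $\sigma_\beta^2$, $\sigma_{\alpha\beta}^2$,
and $\sigma_e^2$ can be as easily obtained as in a balanced data
situation. To see this, let us denote the expected values of $MS_\alpha$,
$MS_\beta$, $MS_{\alpha\beta}$, and $Q_2/\nu_2$ by $\tau_\alpha$, $\tau_\beta$, $\tau_{\alpha\beta}$, and $\tau_e$, respectively:

$$ \tau_\alpha = s\sigma_\alpha^2 + \sigma_{\alpha\beta}^2 + \lambda_{\max}\sigma_e^2, \qquad (4.1) $$

$$ \tau_\beta = r\sigma_\beta^2 + \sigma_{\alpha\beta}^2 + \lambda_{\max}\sigma_e^2, \qquad (4.2) $$

$$ \tau_{\alpha\beta} = \sigma_{\alpha\beta}^2 + \lambda_{\max}\sigma_e^2, \qquad (4.3) $$

$$ \tau_e = \sigma_e^2. \qquad (4.4) $$

Individual $(1-\alpha)100\%$ confidence intervals on $\tau_\alpha$, $\tau_\beta$, $\tau_{\alpha\beta}$, and $\tau_e$
are, respectively,

$$ C_\alpha = \{\tau_\alpha : S_\alpha/\chi^2_{\alpha/2,\,r-1} \leq \tau_\alpha \leq S_\alpha/\chi^2_{1-\alpha/2,\,r-1}\} $$

$$ C_\beta = \{\tau_\beta : S_\beta/\chi^2_{\alpha/2,\,s-1} \leq \tau_\beta \leq S_\beta/\chi^2_{1-\alpha/2,\,s-1}\} $$

$$ C_{\alpha\beta} = \{\tau_{\alpha\beta} : S_{\alpha\beta}/\chi^2_{\alpha/2,\,(r-1)(s-1)} \leq \tau_{\alpha\beta} \leq S_{\alpha\beta}/\chi^2_{1-\alpha/2,\,(r-1)(s-1)}\} $$

$$ C_e = \{\tau_e : Q_2/\chi^2_{\alpha/2,\,\nu_2} \leq \tau_e \leq Q_2/\chi^2_{1-\alpha/2,\,\nu_2}\}, $$

where $\chi^2_{\alpha,m}$ denotes the upper $\alpha 100\%$ point of the chi-squared
distribution with $m$ degrees of freedom. Since $S_\alpha$, $S_\beta$, $S_{\alpha\beta}$, and $Q_2$
are independently distributed, the Cartesian product,
$C = C_\alpha \times C_\beta \times C_{\alpha\beta} \times C_e$, represents an exact rectangular confidence
region on $\boldsymbol{\tau} = (\tau_\alpha, \tau_\beta, \tau_{\alpha\beta}, \tau_e)'$ with a confidence coefficient equal to
$1 - \alpha^* = (1-\alpha)^4$.

Let us now suppose that $\gamma = f(\sigma_\alpha^2, \sigma_\beta^2, \sigma_{\alpha\beta}^2, \sigma_e^2)$ is a
continuous function of the variance components. This function can
be expressed as $\gamma = g(\tau_\alpha, \tau_\beta, \tau_{\alpha\beta}, \tau_e)$, where $g$ is obtained from
$f$ by substituting the variance components by $\tau_\alpha$, $\tau_\beta$, $\tau_{\alpha\beta}$, $\tau_e$ using
equations (4.1) - (4.4). By the method described in Khuri (1981)
for balanced data, the interval

$$ B_g = \{\gamma : \min_{\boldsymbol{\tau} \in C} g(\boldsymbol{\tau}) \leq \gamma \leq \max_{\boldsymbol{\tau} \in C} g(\boldsymbol{\tau})\} $$

is a confidence interval on $\gamma$ with a confidence coefficient
greater than or equal to $1 - \alpha^*$. Furthermore, if $g$ belongs to a
family $G$ of continuous functions of $\tau_\alpha$, $\tau_\beta$, $\tau_{\alpha\beta}$, $\tau_e$, then

$$ P[\gamma \in B_g,\ \forall g \in G] \geq 1 - \alpha^*. $$

Thus, for $g \in G$, the intervals $B_g$ are conservative simultaneous
confidence intervals on the values of all continuous functions of
the variance components for model (2.1).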
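
The following minimal sketch illustrates two steps of this construction under stated assumptions: choosing the individual level $\alpha$ from a target joint level via $1 - \alpha^* = (1-\alpha)^4$, and evaluating the interval $B_g$ when $g$ is monotone in each argument, in which case its minimum and maximum over the rectangle $C$ are attained at corners. The interval endpoints, the value $s = 5$, and all function names are made up for illustration; a general continuous $g$ would need a proper optimizer instead of a corner search.

```python
from itertools import product


def per_interval_alpha(alpha_star, k=4):
    """Individual alpha such that k independent intervals give 1 - alpha_star."""
    return 1.0 - (1.0 - alpha_star) ** (1.0 / k)


def corner_bounds(g, box):
    """Min and max of g over the corners of a rectangle.

    box is a list of (lower, upper) pairs, one per argument of g. The result
    equals the range of g over the whole box only when g is monotone in each
    argument separately.
    """
    values = [g(*corner) for corner in product(*box)]
    return min(values), max(values)


S_LEVELS = 5  # hypothetical number of levels s of the second factor


def sigma2_alpha(tau_a, tau_b, tau_ab, tau_e):
    # From (4.1) and (4.3): sigma_alpha^2 = (tau_alpha - tau_alphabeta) / s.
    return (tau_a - tau_ab) / S_LEVELS


alpha = per_interval_alpha(0.05)
# Made-up intervals for tau_alpha, tau_beta, tau_alphabeta, tau_e.
box = [(2.0, 10.0), (1.0, 8.0), (1.0, 3.0), (0.5, 2.0)]
lower, upper = corner_bounds(sigma2_alpha, box)
```

A negative lower endpoint, as in this made-up example, would ordinarily be truncated at zero because a variance component cannot be negative.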

5.  The  Power  of  the  Exact  Tests 

Power values for each of the exact tests described in Section 2
can be easily computed using the F distribution. We shall only
consider the power of the test concerning $\sigma_\alpha^2$. A similar power
study can be made regarding the test for $\sigma_\beta^2$.

Let $\pi$ denote the power of the test for $\sigma_\alpha^2$. Then

$$ \pi = P\left[\frac{MS_\alpha}{MS_{\alpha\beta}} \geq F_{\alpha,\,r-1,\,(r-1)(s-1)} \,\Big|\, H_a\right], \qquad (5.1) $$

where $H_a: \sigma_\alpha^2 \neq 0$. Under $H_a$,

$$ \frac{\sigma_{\alpha\beta}^2 + \lambda_{\max}\sigma_e^2}{s\sigma_\alpha^2 + \sigma_{\alpha\beta}^2 + \lambda_{\max}\sigma_e^2}\cdot\frac{MS_\alpha}{MS_{\alpha\beta}} \sim F_{r-1,\,(r-1)(s-1)}. $$

Hence, (5.1) can be rewritten as

$$ \pi = P\left[F_{r-1,\,(r-1)(s-1)} \geq \frac{1}{1 + s\theta}\,F_{\alpha,\,r-1,\,(r-1)(s-1)}\right], \qquad (5.2) $$

where

$$ \theta = \sigma_\alpha^2/(\sigma_{\alpha\beta}^2 + \lambda_{\max}\sigma_e^2). $$

From (5.2) it can be seen that $\pi$ is a function of the level of
significance, $\alpha$, $\lambda_{\max}$, which depends on the design used, and the
variance ratios $\sigma_\alpha^2/\sigma_e^2$ and $\sigma_{\alpha\beta}^2/\sigma_e^2$ through $\theta$. The latter variance
ratio is considered a nuisance parameter. Since $\pi$ is a monotone
increasing function of $\theta$, it follows that $\pi$ is

i) a monotone increasing function of $\sigma_\alpha^2/\sigma_e^2$ for a fixed value
of $\sigma_{\alpha\beta}^2/\sigma_e^2$ and a fixed design.

ii) a monotone decreasing function of the nuisance parameter
$\sigma_{\alpha\beta}^2/\sigma_e^2$ for a fixed value of $\sigma_\alpha^2/\sigma_e^2$ and a fixed design.

iii) a monotone decreasing function of $\lambda_{\max}$ for fixed ratios of
the variance components. Consequently, if $n_{..}$, the total
of the cell frequencies, is fixed, then by Lemma 2 higher
power values are expected for smaller values of $d$, where


$$ d = \frac{1}{n_{(1)}} - \frac{1}{rs}\sum_{i,j}\frac{1}{n_{ij}}. $$

For a balanced data set, $d = 0$ and maximum power is achieved. We
can, therefore, regard the quantity $n_{(1)}d$, which belongs to the
interval [0,1), as a measure of imbalance ($n_{(1)}$ being the smallest
cell frequency, as in the Appendix). Small values of this
measure are associated with designs that are nearly balanced. For
a more general discussion concerning measures of imbalance for
unbalanced models, the interested reader is referred to Khuri
(1986).
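
Formula (5.2) is straightforward to evaluate numerically. The sketch below is a minimal illustration restricted to the case in which both $r-1$ and $(r-1)(s-1)$ are even, where the F tail probability has a closed form as a binomial sum; the critical value is found by bisection. The function names and the numerical inputs are assumptions made for illustration, and a statistical library would be needed for odd degrees of freedom.

```python
from math import comb


def f_upper_tail(f, d1, d2):
    """P[F(d1, d2) >= f] for even d1 and d2 (binomial form of the beta integral)."""
    if d1 % 2 or d2 % 2:
        raise ValueError("this closed form needs even degrees of freedom")
    if f <= 0:
        return 1.0
    k, m = d1 // 2, d2 // 2
    x = d2 / (d2 + d1 * f)
    n = m + k - 1
    return sum(comb(n, j) * x ** j * (1 - x) ** (n - j) for j in range(m, n + 1))


def f_critical(alpha, d1, d2):
    """Upper alpha point of F(d1, d2), located by bisection on the tail."""
    lo, hi = 0.0, 1.0
    while f_upper_tail(hi, d1, d2) > alpha:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if f_upper_tail(mid, d1, d2) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def theta(sigma_a, sigma_ab, sigma_e, lam_max):
    """theta = sigma_alpha^2 / (sigma_alphabeta^2 + lambda_max * sigma_e^2)."""
    return sigma_a ** 2 / (sigma_ab ** 2 + lam_max * sigma_e ** 2)


def exact_power(th, r, s, alpha=0.05):
    """Power (5.2) of the exact test of sigma_alpha^2 = 0."""
    d1, d2 = r - 1, (r - 1) * (s - 1)
    crit = f_critical(alpha, d1, d2)
    return f_upper_tail(crit / (1.0 + s * th), d1, d2)


# Hypothetical inputs: a 5 x 5 layout with lambda_max = 0.25.
powers = [exact_power(theta(sa, 1.0, 1.0, 0.25), r=5, s=5)
          for sa in (0.0, 0.5, 1.0, 2.0)]
```

With $\theta = 0$ the power equals the nominal level, and it increases with $\theta$, in line with the monotonicity properties listed above.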

6.  A  Power  Comparison  With  Other  Approximate  Tests 
In  this  section,  we  compare  the  power  of  the  exact  test  statistic, 
MSa/MSag,  given  by  formula  (5.1)  with  powers  of  tests  that  are 
most  commonly  used  in  practice,  namely,  the  ANOVA-based 
approximate  F  tests. 

There are several analyses of variance, each using a
different method of computing sums of squares. Two of these
methods, expressed in "reduction in SS" notation, are

Source of     Degrees of       Type I                                     Type II
Variation     Freedom          SS                                         SS

A             r-1              $R(\alpha \mid \mu)$                       $R(\alpha \mid \mu,\beta)$
B             s-1              $R(\beta \mid \mu,\alpha)$                 $R(\beta \mid \mu,\alpha)$
A*B           (r-1)(s-1)       $R(\alpha\beta \mid \mu,\alpha,\beta)$     $R(\alpha\beta \mid \mu,\alpha,\beta)$
Residual      n.. - rs         Q                                          Q

See Searle (1971, Section 6.3) for a description of the
"reduction" notation. The terminology, "Type I" and "Type II", is
consistent with that of the SAS (1982) System of statistical
software. An approximate F test statistic for $H_0: \sigma_\alpha^2 = 0$ based on
the Type i SS (i = I, II) has the form $F(i) = MS_\alpha(i)/MS^*_{\alpha\beta}(i)$.
The numerator, $MS_\alpha(i)$, is the Type i mean square for $\alpha$. The
denominator, $MS^*_{\alpha\beta}(i)$, is

$$ MS^*_{\alpha\beta}(i) = \hat{\sigma}_e^2 + k_1(i)\,\hat{\sigma}_{\alpha\beta}^2 + k_2(i)\,\hat{\sigma}_\beta^2, $$

where $k_1(i)$ and $k_2(i)$ are the coefficients in the expected mean
square

$$ E[MS_\alpha(i)] = \sigma_e^2 + k_1(i)\,\sigma_{\alpha\beta}^2 + k_2(i)\,\sigma_\beta^2 + k_3(i)\,\sigma_\alpha^2, $$

and $\hat{\sigma}_e^2$, $\hat{\sigma}_{\alpha\beta}^2$, and $\hat{\sigma}_\beta^2$ are, respectively, the analysis of variance
estimators of $\sigma_e^2$, $\sigma_{\alpha\beta}^2$, and $\sigma_\beta^2$, based upon $Q$, $R(\alpha\beta \mid \mu,\alpha,\beta)$,
and $R(\beta \mid \mu,\alpha)$. (Note: $k_2(\mathrm{II}) = 0$.)

Powers of the approximate test statistics F(I) and F(II) were
estimated via computer simulation. The simulation study required
two steps: the first to estimate critical values of F(I) and F(II)
under $H_0: \sigma_\alpha^2 = 0$; the second to estimate the power for $\sigma_\alpha^2 > 0$.

All simulations were conducted using PROC MATRIX of the SAS
(1982) System. The SAS functions RANNOR and RANGAM were used to
generate pseudo-random normal and chi-squared variates,
respectively. Powers were estimated for 25 combinations of values
of the variance components and six $n_{ij}$ patterns, making 150 cases
in all. Without loss of generality, $\sigma_e = 1.0$ was used in all
combinations. Values of $\sigma_{\alpha\beta}$ and $\sigma_\beta$ constituted a "response
surface design" containing a 2x2 factorial and an interior point,
namely, (0.2, 0.2), (0.2, 5.0), (5.0, 0.2), (5.0, 5.0) and
(1.0, 1.0). For each of these five combinations, five values of
$\sigma_\alpha$ = 0.2, 0.5, 1.0, 2.0 and 5.0 were considered to produce the 25
combinations of $\sigma_{\alpha\beta}$, $\sigma_\beta$, and $\sigma_\alpha$.

The six $n_{ij}$ patterns contained three "near balance" patterns
(NB) and three "highly unbalanced" patterns (HU), each containing
5x5, 5x10 and 10x5 arrays. The six patterns, with rows
representing levels of factor A and columns representing levels of
factor B, are:

r=5, s=5

    NB (near balance)         HU (highly unbalanced)

    5  5  5  5  6              9  2  9  1  2
    4  4  6  4  5             10  1  2  9 10
    6  4  4  4  4              1  8  1  2  2
    4  6  5  5  6              9 10  1  9  3
    6  5  4  5  6              8  3  2 10  1

r=10, s=5

    NB (near balance)         HU (highly unbalanced)

    5  5  5  5  6              9  2  9  1  2
    4  4  6  4  5             10  1  2  9 10
    6  4  4  4  4              1  8  1  2  2
    4  6  5  5  6              9 10  1  9  3
    6  5  4  5  6              8  3  2 10  1
    5  5  5  5  6              9  2  9  1  2
    4  4  6  4  5             10  1  2  9 10
    6  4  4  4  4              1  8  1  2  2
    4  6  5  5  6              9 10  1  9  3
    6  5  4  5  6              8  3  2 10  1

r=5, s=10

    NB (near balance)

    5  5  5  5  6  5  5  5  5  6
    4  4  6  4  5  4  4  6  4  5
    6  4  4  6  4  6  4  4  6  4
    4  6  5  5  6  4  6  5  5  6
    6  5  4  5  6  6  5  4  5  6

    HU (highly unbalanced)

    9  2  9  1  2  9  2  9  1  2
   10  1  2  9 10 10  1  2  9 10
    1  8  1  2  2  1  8  1  2  2
    9 10  1  9  3  9 10  1  9  3
    8  3  2 10  1  8  3  2 10  1


The distributions of F(I) and F(II) depend on the true values
of $\sigma_{\alpha\beta}^2$ and $\sigma_\beta^2$, even under the null hypothesis $H_0: \sigma_\alpha^2 = 0$.
Therefore, it was necessary to estimate the critical values of
F(I) and F(II) for all values of $\sigma_{\alpha\beta}$ and $\sigma_\beta$, and all patterns
involved in the power study. This was done as follows: For each
of the five combinations of $\sigma_{\alpha\beta}$ and $\sigma_\beta$ and each of the six $n_{ij}$
patterns (thirty cases in all), 1000 sets of cell means $\bar{y}_{ij.}$
and $Q$ values were generated according to the model
$\bar{y}_{ij.} = \mu + \beta_j + (\alpha\beta)_{ij} + \bar{e}_{ij.}$, where $\beta_j$, $(\alpha\beta)_{ij}$, and $\bar{e}_{ij.}$ are
independently distributed as normal variates with zero means and
variances $\sigma_\beta^2$, $\sigma_{\alpha\beta}^2$, and $\sigma_e^2/n_{ij}$, respectively, $Q/\sigma_e^2$ has the
chi-squared distribution with $n_{..} - rs$ degrees of freedom, and, without
loss of generality, $\mu = 0$. (Note the absence of $\alpha_i$ in the model,
corresponding to $\sigma_\alpha = 0$.) For each set of $\bar{y}_{ij.}$ and $Q$ values, F(I)
and F(II) were calculated, and the 95% sample quantiles of F(I)
and F(II) were recorded from the 1000 sets. This process was
repeated ten times, and the mean and standard deviation of the ten
F(I) and F(II) quantiles were computed to estimate the upper 5%
critical values for F(I) and F(II). These are reported in Tables
6.1 and 6.2.
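
The two ingredients of this procedure can be sketched as follows. This is not the original PROC MATRIX program: it is a minimal Python illustration, assuming the standard-library generators `random.gauss` and `random.gammavariate` in place of RANNOR and RANGAM. Computing F(I) and F(II) from the generated values is left to a user-supplied callable, here called `draw_statistic`, which is a hypothetical name.

```python
import random


def simulate_null_set(n, sigma_beta, sigma_ab, sigma_e=1.0, rng=random):
    """Draw one set of cell means and Q under sigma_alpha = 0 and mu = 0.

    n is the r x s table of cell frequencies n_ij (a list of rows).
    """
    r, s = len(n), len(n[0])
    beta = [rng.gauss(0.0, sigma_beta) for _ in range(s)]
    ybar = [[beta[j]
             + rng.gauss(0.0, sigma_ab)                    # (alpha beta)_ij
             + rng.gauss(0.0, sigma_e / n[i][j] ** 0.5)    # mean error of the cell
             for j in range(s)] for i in range(r)]
    df = sum(map(sum, n)) - r * s
    # Chi-squared(df) is Gamma(shape = df / 2, scale = 2).
    q = sigma_e ** 2 * rng.gammavariate(df / 2.0, 2.0)
    return ybar, q


def estimate_critical_value(draw_statistic, sets=1000, repeats=10, level=0.95):
    """Mean and standard deviation of repeated sample quantiles of a statistic."""
    quantiles = []
    for _ in range(repeats):
        values = sorted(draw_statistic() for _ in range(sets))
        quantiles.append(values[int(level * sets)])
    mean = sum(quantiles) / repeats
    sd = (sum((x - mean) ** 2 for x in quantiles) / (repeats - 1)) ** 0.5
    return mean, sd


# One simulated set for a made-up 2 x 2 pattern.
ybar, q = simulate_null_set([[2, 3], [4, 5]], sigma_beta=1.0, sigma_ab=0.5,
                            rng=random.Random(0))
```

In use, `draw_statistic` would call `simulate_null_set` for the chosen pattern and return the resulting value of F(I) or F(II); the sample-quantile convention (the order statistic at index `level * sets`) is one of several reasonable choices.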

The estimated critical values in Tables 6.1 and 6.2
demonstrate the degree of dependence of the null distributions of
F(I) and F(II) upon the nuisance parameters, $\sigma_{\alpha\beta}$ and $\sigma_\beta$. The most
serious disturbance of the distributions is for small values of
$\sigma_{\alpha\beta}$ (= 0.2), especially in "highly unbalanced" cases (Table 6.2).

In practice, the true critical values of F(I) and F(II) would
not be available because $\sigma_{\alpha\beta}$ and $\sigma_\beta$ are not known. Instead, the
calculated values of F(I) and F(II) would typically be referred to
an F distribution with denominator degrees of freedom given by a
Satterthwaite-type approximation such as illustrated by Milliken
and Johnson (1984, Section 20.1.2). Actual rejection
probabilities (type 1 error rates) corresponding to a nominal
$\alpha$ = .05 for F(I) and F(II) using these approximate degrees of
freedom were estimated in the simulation study. These are also
reported in Tables 6.1 and 6.2. The results show that the
Satterthwaite-type approximate procedures produce true type 1
error rates that are far less than the nominal .05 for some cases,
particularly those with small values of $\sigma_{\alpha\beta}$ in the highly
unbalanced situation.

Estimation of power for the statistics F(I) and F(II)
followed a process similar to that used to estimate the critical
values. For each of the 25 selected combinations of $\sigma_{\alpha\beta}$, $\sigma_\beta$ and $\sigma_\alpha$
and each of the six $n_{ij}$ patterns, 2000 sets of cell means $\bar{y}_{ij.}$ and
$Q$ values were generated according to model (2.4) with $\mu$ taken
equal to zero. The statistics F(I) and F(II) were calculated for
each set of $\bar{y}_{ij.}$ and $Q$ values. The proportion of times, out of
the 2000, that F(I) and F(II) exceeded the estimated critical
values in Tables 6.1 and 6.2 was computed. These proportions are
estimates of the powers of F(I) and F(II) for testing $H_0: \sigma_\alpha^2 = 0$,
and are recorded in Tables 6.3 and 6.4. Powers of the exact test
statistic $MS_\alpha/MS_{\alpha\beta}$ are also reported in the latter tables. These
results show that, with a few exceptions, the power of the exact
test is better than or essentially as good as the power of either of
the approximate procedures. The exceptions are for small values
of $\sigma_\alpha$ (0.2 and 0.5) and small values of $\sigma_{\alpha\beta}$ (0.2).

It must be remembered that the approximate tests whose powers
are shown in Tables 6.3 and 6.4 could not be computed in practice
because their critical values depend on the unknown $\sigma_{\alpha\beta}$ and $\sigma_\beta$.
The dependence is most severe for small values of $\sigma_{\alpha\beta}$. These are
the same values of $\sigma_{\alpha\beta}$ for which the power of the exact test was
inferior to the approximate tests. Therefore, the power of the
exact test appears to be generally as good as or better than the powers
of the approximate tests except in cases for which valid critical
values of the approximate tests are most unreliable.
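APPENDIX
Proof of Lemma 2

Consider the orthogonal matrix $\mathbf{P}$ whose first row is $(rs)^{-1/2}\mathbf{1}_{rs}'$
and which simultaneously diagonalizes $\mathbf{A}_1 = \mathbf{B}_1\mathbf{B}_1'$ and $\mathbf{A}_2 = \mathbf{B}_2\mathbf{B}_2'$, where $\mathbf{B}_1$ and $\mathbf{B}_2$
are the matrices in (2.5). Let $\mathbf{P}_1$ be the submatrix of $\mathbf{P}$ obtained
by deleting the first row. It is easy to verify that

$$ \mathbf{P}_1\mathbf{P}_1' = \mathbf{I}_{rs-1}, \qquad (A.1) $$

$$ \mathbf{P}_1'\mathbf{P}_1 + \frac{1}{rs}\mathbf{J}_{rs} = \mathbf{I}_{rs}. \qquad (A.2) $$

Now, $\lambda_{\max}$ is the largest eigenvalue of $\mathbf{L} = \mathbf{P}_1\mathbf{K}\mathbf{P}_1'$, that is,
$\lambda_{\max} = e_{\max}(\mathbf{P}_1\mathbf{K}\mathbf{P}_1')$. If $n_{(1)}$ is the smallest cell frequency, then the
matrix $(1/n_{(1)})\mathbf{P}_1\mathbf{P}_1' - \mathbf{P}_1\mathbf{K}\mathbf{P}_1'$ is positive semidefinite. Using
(A.1) we get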
Table 6.1

Estimated 95% quantiles of F(I) and F(II), and estimated type 1
error rates for F(I) and F(II) using Satterthwaite's approximate
degrees of freedom for a nominal $\alpha$ = .05 for "near balance" cases

Design   $\sigma_{\alpha\beta}$   $\sigma_\beta$   95% Quantiles   Type 1 Error Rates

Table 6.3 (continued)

                                                 $\sigma_\alpha$
Design  $\sigma_{\alpha\beta}$  $\sigma_\beta$  Test    0.2     0.5     1.0     2.0     5.0

5x10    0.2   0.2   I      0.416   0.920   0.991   0.999   1.000
                    II     0.410   0.920   0.991   0.999   1.000
                    E      0.398   0.907   0.991   0.999   1.000

5x10    0.2   5.0   I      0.182   0.630   0.930   0.995   0.999
                    II     0.428   0.923   0.992   1.000   1.000
                    E      0.398   0.907   0.991   0.999   1.000

5x10    1.0   1.0   I      0.119   0.485   0.887   0.988   0.998
                    II     0.117   0.492   0.895   0.988   0.998
                    E      0.117   0.493   0.884   0.988   0.999

5x10    5.0   0.2   I      0.057   0.062   0.125   0.392   0.912
                    II     0.055   0.061   0.128   0.392   0.912
                    E      0.052   0.067   0.133   0.409   0.912

5x10    5.0   5.0   I      0.057   0.074   0.136   0.415   0.919
                    II     0.055   0.075   0.136   0.418   0.924
                    E      0.052   0.067   0.133   0.409   0.912
Table 6.4

Estimated powers of F(I) and F(II) and
exact powers (E) of $MS_\alpha/MS_{\alpha\beta}$ for "highly unbalanced" cases.

                                                 $\sigma_\alpha$
Design  $\sigma_{\alpha\beta}$  $\sigma_\beta$  Test    0.2     0.5     1.0     2.0     5.0

5x5     0.2   0.2   I      0.142   0.529   0.895   0.980   0.990
                    II     0.126   0.498   0.873   0.976   0.987
                    E      0.101   0.413   0.837   0.981   0.999

5x5     0.2   5.0   I      0.063   0.127   0.296   0.602   0.927
                    II     0.132   0.518   0.881   0.975   0.987
                    E      0.101   0.413   0.837   0.981   0.999

5x5     1.0   1.0   I      0.067   0.179   0.511   0.685   0.995
                    II     0.059   0.178   0.520   0.901   0.998
                    E      0.069   0.198   0.578   0.919   0.996

5x5     5.0   0.2   I      0.040   0.054   0.070   0.154   0.606
                    II     0.036   0.062   0.077   0.154   0.609
                    E      0.051   0.057   0.082   0.201   0.727

5x5     5.0   5.0   I      0.059   0.055   0.065   0.136   0.551
                    II     0.051   0.044   0.065   0.150   0.609
                    E      0.051   0.057   0.082   0.201   0.727

10x5    0.2   0.2   I      0.218   0.853   0.997   1.000   1.000
                    II     0.231   0.854   0.996   1.000   1.000
                    E      0.136   0.667   0.983   0.999   1.000

10x5    0.2   5.0   I      0.063   0.144   0.363   0.713   0.989
                    II     0.227   0.861   0.998   1.000   1.000
                    E      0.136   0.667   0.983   0.999   1.000

10x5    1.0   1.0   I      0.078   0.261   0.736   0.991   1.000
                    II     0.081   0.294   0.794   0.996   1.000
                    E      0.079   0.314   0.844   0.996   1.000

10x5    5.0   0.2   I      0.052   0.056   0.088   0.216   0.854
                    II     0.051   0.054   0.085   0.214   0.852
                    E      0.051   0.060   0.100   0.317   0.943

10x5    5.0   5.0   I      0.046   0.081   0.080   0.182   0.792
                    II     0.052   0.070   0.091   0.220   0.852
                    E      0.051   0.060   0.100   0.317   0.943
Table 6.4 (continued)

                                                 $\sigma_\alpha$
Design  $\sigma_{\alpha\beta}$  $\sigma_\beta$  Test    0.2     0.5     1.0     2.0     5.0

5x10    0.2   0.2   I      0.330   0.865   0.983   0.999   1.000
                    II     0.311   0.860   0.985   0.999   1.000
                    E      0.190   0.711   0.958   0.996   0.999

5x10    0.2   5.0   I      0.061   0.107   0.292   0.668   0.966
                    II     0.289   0.856   0.984   0.999   1.000
                    E      0.190   0.711   0.958   0.996   0.999

5x10    1.0   1.0   I      0.087   0.311   0.754   0.967   0.999
                    II     0.095   0.373   0.801   0.975   0.999
                    E      0.099   0.406   0.833   0.981   0.999

5x10    5.0   0.2   I      0.050   0.053   0.095   0.303   0.814
                    II     0.057   0.059   0.099   0.309   0.815
                    E      0.052   0.067   0.131   0.405   0.910

5x10    5.0   5.0   I      0.046   0.056   0.089   0.227   0.773
                    II     0.051   0.060   0.107   0.275   0.834
                    E      0.052   0.067   0.131   0.405   0.910